Results 1 - 20 of 24,034
1.
Genome Med ; 16(1): 50, 2024 Apr 02.
Article in English | MEDLINE | ID: mdl-38566210

ABSTRACT

BACKGROUND: Mitochondria play essential roles in tumorigenesis; however, little is known about the contribution of mitochondrial DNA (mtDNA) to esophageal squamous cell carcinoma (ESCC). Whole-genome sequencing (WGS) is by far the most efficient technology to fully characterize the molecular features of mtDNA; however, due to the high redundancy and heterogeneity of mtDNA in regular WGS data, methods for mtDNA analysis are far from satisfactory. METHODS: Here, we developed a likelihood-based method dMTLV to identify low-heteroplasmic mtDNA variants. In addition, we described fNUMT, which can simultaneously detect non-reference nuclear sequences of mitochondrial origin (non-ref NUMTs) and their derived artifacts. Using these new methods, we explored the contribution of mtDNA to ESCC utilizing the multi-omics data of 663 paired tumor-normal samples. RESULTS: dMTLV outperformed the existing methods in sensitivity without sacrificing specificity. The verification using Nanopore long-read sequencing data showed that fNUMT has superior specificity and more accurate breakpoint identification than the current methods. Leveraging the new method, we identified a significant association between the ESCC overall survival and the ratio of mtDNA copy number of paired tumor-normal samples, which could be potentially explained by the differential expression of genes enriched in pathways related to metabolism, DNA damage repair, and cell cycle checkpoint. Additionally, we observed that the expression of CBWD1 was downregulated by the non-ref NUMTs inserted into its intron region, which might provide precursor conditions for the tumor cells to adapt to a hypoxic environment. Moreover, we identified a strong positive relationship between the number of mtDNA truncating mutations and the contribution of signatures linked to tumorigenesis and treatment response. 
CONCLUSIONS: Our new frameworks promote the characterization of mtDNA features, which enables the elucidation of the landscapes and roles of mtDNA in ESCC essential for extending the current understanding of ESCC etiology. dMTLV and fNUMT are freely available from https://github.com/sunnyzxh/dMTLV and https://github.com/sunnyzxh/fNUMT , respectively.


Subjects
Esophageal Neoplasms , Esophageal Squamous Cell Carcinoma , Humans , Esophageal Squamous Cell Carcinoma/genetics , Mitochondrial DNA/genetics , Mitochondrial DNA/analysis , Mitochondrial DNA/metabolism , Esophageal Neoplasms/genetics , Esophageal Neoplasms/metabolism , Esophageal Neoplasms/pathology , Likelihood Functions , Mitochondria/genetics , Carcinogenesis
2.
Biometrics ; 80(2)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38563532

ABSTRACT

Deep learning has continuously attained huge success in diverse fields, while its application to survival data analysis remains limited and deserves further exploration. For the analysis of current status data, a deep partially linear Cox model is proposed to circumvent the curse of dimensionality. Modeling flexibility is attained by using deep neural networks (DNNs) to accommodate nonlinear covariate effects and monotone splines to approximate the baseline cumulative hazard function. We establish the convergence rate of the proposed maximum likelihood estimators. Moreover, we derive that the finite-dimensional estimator for treatment covariate effects is $\sqrt{n}$-consistent, asymptotically normal, and attains semiparametric efficiency. Finally, we demonstrate the performance of our procedures through extensive simulation studies and application to real-world data on news popularity.


Subjects
Proportional Hazards Models , Likelihood Functions , Survival Analysis , Computer Simulation , Linear Models
3.
Biom J ; 66(3): e2300238, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38581103

ABSTRACT

In a two-way additive analysis of variance (ANOVA) model, we consider the problem of testing for homogeneity of both row and column effects against their simultaneous ordering. The error variances are assumed to be heterogeneous with unbalanced samples in each cell. Two simultaneous test procedures are developed: the first uses the likelihood ratio test (LRT) statistics of two independent hypotheses, and the second is based on the consecutive pairwise differences of estimators of effects. The parametric bootstrap (PB) approach is used to find critical points of both tests, and the asymptotic accuracy of the bootstrap is established. An extensive simulation study shows that the proposed tests achieve the nominal size and have very good power performance. The robustness of the tests is also analyzed under deviation from normality. An R package is developed and shared on GitHub for ease of implementation by users. The proposed tests are illustrated using a real data set on the mortality due to alcoholic liver disease, and it is shown that age and gender have a significant impact on the increasing incidence of mortality.


Subjects
Statistical Models , Analysis of Variance , Computer Simulation , Likelihood Functions
4.
Virol J ; 21(1): 84, 2024 Apr 10.
Article in English | MEDLINE | ID: mdl-38600521

ABSTRACT

BACKGROUND: MERS-CoV is a coronavirus known to cause severe disease in humans, taxonomically classified under the subgenus Merbecovirus. Recent findings showed that the close relatives of MERS-CoV infecting vespertilionid bats (family Vespertilionidae), named NeoCoV and PDF-2180, use their hosts' ACE2 as their entry receptor, unlike the DPP4 receptor usage of MERS-CoV. Previous research suggests that this difference in receptor usage between these related viruses is a result of recombination. However, the precise location of the recombination breakpoints and the details of the recombination event leading to the change of receptor usage remain unclear. METHODS: We used maximum likelihood-based phylogenetics and genetic similarity comparisons to characterise the evolutionary history of all complete Merbecovirus genome sequences. Recombination events were detected by multiple computational methods implemented in the recombination detection program. To verify the influence of recombination, we inferred the phylogenetic relation of the merbecovirus genomes excluding recombinant segments and that of the viruses' receptor binding domains and examined the level of congruency between the phylogenies. Finally, the geographic distribution of the genomes was inspected to identify the possible location where the recombination event occurred. RESULTS: Similarity plot analysis and the recombination-partitioned phylogenetic inference showed that MERS-CoV is highly similar to NeoCoV (and PDF-2180) across its whole genome except for the spike-encoding region. This is confirmed to be due to recombination by confidently detecting a recombination event between the proximal ancestor of MERS-CoV and a currently unsampled merbecovirus clade. Notably, the upstream recombination breakpoint was detected in the N-terminal domain and the downstream breakpoint at the S2 subunit of spike, indicating that the acquired recombined fragment includes the receptor-binding domain.
A tanglegram comparison further confirmed that the receptor binding domain-encoding region of MERS-CoV was acquired via recombination. Geographic mapping analysis on sampling sites suggests the possibility that the recombination event occurred in Africa. CONCLUSION: Together, our results suggest that recombination can lead to receptor switching of merbecoviruses during circulation in bats. These results are useful for future epidemiological assessments and surveillance to understand the spillover risk of bat coronaviruses to the human population.


Subjects
Chiroptera , Coronavirus Infections , Middle East Respiratory Syndrome Coronavirus , Animals , Humans , Middle East Respiratory Syndrome Coronavirus/genetics , Phylogeny , Likelihood Functions , Coronavirus Infections/veterinary , Coronavirus Infections/epidemiology , Genetic Recombination , Coronavirus Spike Glycoprotein/genetics , Coronavirus Spike Glycoprotein/metabolism
5.
Stat Med ; 43(9): 1671-1687, 2024 Apr 30.
Article in English | MEDLINE | ID: mdl-38634251

ABSTRACT

We consider estimation of the semiparametric additive hazards model with an unspecified baseline hazard function where the effect of a continuous covariate has a specific shape but is otherwise unspecified. Such estimation is particularly useful for a unimodal hazard function, where the hazard is monotone increasing and then monotone decreasing with an unknown mode. The popular proportional hazards approach is limited in such settings due to the complicated structure of the partial likelihood. Our model defines a quadratic loss function, and its simple structure allows a global Hessian matrix that does not involve parameters. Thus, once the global Hessian matrix is computed, a standard quadratic programming method can be applied by profiling all possible locations of the mode. However, the quadratic programming method may handle a large global Hessian matrix inefficiently in the profiling algorithm because of the high dimensionality: the dimension of the global Hessian matrix and the number of hypothetical modes are of the same order as the sample size. We propose the quadratic pool adjacent violators algorithm to reduce computational costs. The proposed algorithm is extended to the model with a time-dependent covariate with a monotone or U-shaped hazard function. In simulation studies, our proposed method improves computational speed compared to the quadratic programming method, with bias and mean square error reductions. We analyze data from a recent cardiovascular study.


Subjects
Algorithms , Humans , Proportional Hazards Models , Computer Simulation , Probability , Bias , Likelihood Functions
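
The pool adjacent violators idea named in entry 5 can be illustrated with its classical least-squares form. The sketch below is a minimal illustration of ordinary isotonic regression under assumed inputs; it is not the quadratic pool adjacent violators algorithm proposed in the paper, and the function name `pava` and the sample values are hypothetical.

```python
def pava(y, w=None):
    """Weighted least-squares fit of a non-decreasing sequence to y."""
    if w is None:
        w = [1.0] * len(y)
    # Each block stores [weighted mean, total weight, number of points].
    blocks = []
    for yi, wi in zip(y, w):
        blocks.append([float(yi), float(wi), 1])
        # Merge backwards while adjacent block means violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, n1 + n2])
    # Expand each block back to one fitted value per observation.
    fit = []
    for mean, _, count in blocks:
        fit.extend([mean] * count)
    return fit


# Hypothetical data with two order violations (3 > 2 and 4 > 3.5).
fitted = pava([1.0, 3.0, 2.0, 4.0, 3.5])
print(fitted)
```

Each violating pair is replaced by its weighted mean, so the fit is obtained in a single pass with occasional back-merging rather than by solving a general quadratic program.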
6.
Biometrics ; 80(1)2024 Jan 29.
Article in English | MEDLINE | ID: mdl-38465986

ABSTRACT

This paper proposes a novel likelihood-based boosting method for the selection of the random effects in linear mixed models. The nonconvexity of the objective function to minimize, which is the negative profile log-likelihood, requires the adoption of new solutions. In this respect, our optimization approach also employs the directions of negative curvature besides the usual Newton directions. A simulation study and a real-data application show the good performance of the proposal.


Subjects
Likelihood Functions , Linear Models , Computer Simulation
7.
Biometrics ; 80(1)2024 Jan 29.
Article in English | MEDLINE | ID: mdl-38465988

ABSTRACT

Mixed panel count data represent a common complex data structure in longitudinal survey studies. A major challenge in analyzing such data is variable selection and estimation while efficiently incorporating both the panel count and panel binary data components. Analyses in the medical literature have often ignored the panel binary component and treated it as missing with the unknown panel counts, while obviously such a simplification does not effectively utilize the original data information. In this research, we put forward a penalized likelihood variable selection and estimation procedure under the proportional mean model. A computationally efficient EM algorithm is developed that ensures sparse estimation for variable selection, and the resulting estimator is shown to have the desirable oracle property. Simulation studies assessed and confirmed the good finite-sample properties of the proposed method, and the method is applied to analyze a motivating dataset from the Health and Retirement Study.


Subjects
Algorithms , Likelihood Functions , Computer Simulation , Longitudinal Studies
8.
Lifetime Data Anal ; 30(2): 472-500, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38436831

ABSTRACT

In clinical studies, one often encounters time-to-event data that are subject to right censoring and for which a fraction of the patients under study never experience the event of interest. Such data can be modeled using cure models in survival analysis. In the presence of a cure fraction, the mixture cure model is popular, since it allows modeling both the probability of being cured (called the incidence) and the survival function of the uncured individuals (called the latency). In this paper, we develop a variable selection procedure for the incidence and latency parts of a mixture cure model, consisting of a logistic model for the incidence and a semiparametric accelerated failure time model for the latency. We use a penalized likelihood approach, based on adaptive LASSO penalties for each part of the model, and we consider two algorithms for optimizing the criterion function. Extensive simulations are carried out to assess the accuracy of the proposed selection procedure. Finally, we apply the proposed method to a real dataset on heart failure patients with left ventricular systolic dysfunction.


Subjects
Algorithms , Statistical Models , Humans , Likelihood Functions , Survival Analysis , Logistic Models , Computer Simulation
9.
Cereb Cortex ; 34(3)2024 Mar 01.
Article in English | MEDLINE | ID: mdl-38466117

ABSTRACT

Speech disorders are associated with different degrees of functional and structural abnormalities. However, the abnormalities associated with specific disorders, and the common abnormalities shown by all disorders, remain unclear. Herein, a meta-analysis was conducted to integrate the results of 70 studies that compared 1843 speech disorder patients (dysarthria, dysphonia, stuttering, and aphasia) to 1950 healthy controls in terms of brain activity, functional connectivity, gray matter, and white matter fractional anisotropy. The analysis revealed that compared to controls, the dysarthria group showed higher activity in the left superior temporal gyrus and lower activity in the left postcentral gyrus. The dysphonia group had higher activity in the right precentral and postcentral gyrus. The stuttering group had higher activity in the right inferior frontal gyrus and lower activity in the left inferior frontal gyrus. The aphasia group showed lower activity in the bilateral anterior cingulate gyrus and left superior frontal gyrus. Across the four disorders, there were concurrent lower activity, gray matter, and fractional anisotropy in motor and auditory cortices, and stronger connectivity between the default mode network and frontoparietal network. These findings enhance our understanding of the neural basis of speech disorders, potentially aiding clinical diagnosis and intervention.


Subjects
Aphasia , Auditory Cortex , Dysphonia , Stuttering , Humans , Dysarthria , Likelihood Functions , Speech Disorders
10.
Biometrics ; 80(2)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38536746

ABSTRACT

The paper extends the empirical likelihood (EL) approach of Liu et al. to a new and very flexible family of latent class models for capture-recapture data, also allowing for serial dependence on previous capture history, conditionally on latent type and covariates. The EL approach allows the overall population size to be estimated directly rather than by adding estimates conditional on covariate configurations. A Fisher-scoring algorithm for maximum likelihood estimation is proposed and a more efficient alternative to the traditional EL approach for estimating the non-parametric component is introduced; this allows us to show that the mapping between the non-parametric distribution of the covariates and the probabilities of being never captured is one-to-one and strictly increasing. Asymptotic results are outlined, and a procedure for constructing profile likelihood confidence intervals for the population size is presented. Two examples based on real data are used to illustrate the proposed approach, and a simulation study indicates that, when estimating the overall undercount, the method proposed here is substantially more efficient than the one based on conditional maximum likelihood estimation, especially when the sample size is not sufficiently large.


Subjects
Statistical Models , Likelihood Functions , Computer Simulation , Population Density , Sample Size
11.
Bull Math Biol ; 86(4): 40, 2024 Mar 15.
Article in English | MEDLINE | ID: mdl-38489047

ABSTRACT

The use of nonlinear statistical methods and models is ubiquitous in scientific research. However, these methods may not be fully understood, and as demonstrated here, commonly reported parameter p-values and confidence intervals may be inaccurate. The gentle introduction to nonlinear regression modelling and the comprehensive illustrations given here provide applied researchers with the needed overview and tools to appreciate the nuances and breadth of these important methods. Since these methods build upon topics covered in first and second courses in applied statistics and predictive modelling, the target audience includes practitioners and students alike. To guide practitioners, we summarize, illustrate, develop, and extend nonlinear modelling methods, underscore caveats of Wald statistics using basic illustrations, and give key reasons for preferring likelihood methods. Parameter profiling in multiparameter models and exact or near-exact versus approximate likelihood methods are discussed, and curvature measures are connected with the failure of the Wald approximations regularly used in statistical software. The discussion in the main paper has been kept at an introductory level and can be covered on a first reading; additional details given in the Appendices can be worked through upon further study. The associated online Supplementary Information also provides the data and R code, which can be easily adapted to help researchers fit nonlinear models to their data.


Subjects
Biological Models , Nonlinear Dynamics , Humans , Computer Simulation , Mathematical Concepts , Likelihood Functions , Statistical Models
12.
Epidemiology ; 35(3): 295-307, 2024 May 01.
Article in English | MEDLINE | ID: mdl-38465940

ABSTRACT

Understanding the incidence of disease is often crucial for public policy decision-making, as observed during the COVID-19 pandemic. Estimating incidence is challenging, however, when the definition of incidence relies on tests that imperfectly measure disease, as in the case when assays with variable performance are used to detect the SARS-CoV-2 virus. To our knowledge, there are no pragmatic methods to address the bias introduced by the performance of labs in testing for the virus. In the setting of a longitudinal study, we developed a maximum likelihood estimation-based approach to estimate laboratory performance-adjusted incidence using the expectation-maximization algorithm. We constructed confidence intervals (CIs) using both bootstrapped-based and large-sample interval estimator approaches. We evaluated our methods through extensive simulation and applied them to a real-world study (TrackCOVID), where the primary goal was to determine the incidence of and risk factors for SARS-CoV-2 infection in the San Francisco Bay Area from July 2020 to March 2021. Our simulations demonstrated that our method converged rapidly with accurate estimates under a variety of scenarios. Bootstrapped-based CIs were comparable to the large-sample estimator CIs with a reasonable number of incident cases, shown via a simulation scenario based on the real TrackCOVID study. In more extreme simulated scenarios, the coverage of large-sample interval estimation outperformed the bootstrapped-based approach. Results from the application to the TrackCOVID study suggested that assuming perfect laboratory test performance can lead to an inaccurate inference of the incidence. Our flexible, pragmatic method can be extended to a variety of disease and study settings.


Subjects
COVID-19 , Pandemics , Humans , Likelihood Functions , Incidence , Longitudinal Studies , Computer Simulation , COVID-19/epidemiology
13.
Am J Hum Genet ; 111(4): 654-667, 2024 Apr 04.
Article in English | MEDLINE | ID: mdl-38471507

ABSTRACT

Allele-specific methylation (ASM) is an epigenetic modification whereby one parental allele becomes methylated and the other unmethylated at a specific locus. ASM is most often driven by the presence of nearby heterozygous variants that influence methylation, but also occurs somatically in the context of genomic imprinting. In this study, we investigate ASM using publicly available single-cell reduced representation bisulfite sequencing (scRRBS) data on 608 B cells sampled from six healthy B cell samples and 1,230 cells from 11 chronic lymphocytic leukemia (CLL) samples. We developed a likelihood-based criterion to test whether a CpG exhibited ASM, based on the distributions of methylated and unmethylated reads both within and across cells. Applying our likelihood ratio test, 65,998 CpG sites exhibited ASM in healthy B cell samples according to a Bonferroni criterion ($p < 8.4 \times 10^{-9}$), and 32,862 CpG sites exhibited ASM in CLL samples ($p < 8.5 \times 10^{-9}$). We also called ASM at the sample level. To evaluate the accuracy of our method, we called heterozygous variants from the scRRBS data, which enabled variant-based calls of ASM within each cell. Comparing sample-level ASM calls to the variant-based measures of ASM, we observed a positive predictive value of 76%-100% across samples. We observed high concordance of ASM across samples and an overrepresentation of ASM in previously reported imprinted genes and genes with imprinting binding motifs. Our study demonstrates that single-cell bisulfite sequencing is a potentially powerful tool to investigate ASM, especially as studies expand to increase the number of samples and cells sequenced.


Subjects
DNA Methylation , B-Cell Chronic Lymphocytic Leukemia , Sulfites , Humans , DNA Methylation/genetics , Alleles , B-Cell Chronic Lymphocytic Leukemia/genetics , Likelihood Functions , Genomic Imprinting/genetics , CpG Islands/genetics
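
The Bonferroni thresholds quoted in entry 13 can be related to the number of tests with simple arithmetic. The sketch below assumes a family-wise significance level of 0.05, which the abstract does not state, so the implied test counts are illustrative only; the function names are hypothetical.

```python
ALPHA = 0.05  # assumed family-wise error rate; not given in the abstract


def bonferroni_threshold(alpha, n_tests):
    """Per-test p-value cutoff that controls the family-wise error at alpha."""
    return alpha / n_tests


def implied_tests(alpha, threshold):
    """Number of tests that would yield the given Bonferroni cutoff."""
    return alpha / threshold


# Thresholds reported in the abstract for the healthy and CLL samples.
n_healthy = implied_tests(ALPHA, 8.4e-9)
n_cll = implied_tests(ALPHA, 8.5e-9)
print(f"healthy: about {n_healthy:,.0f} tests")
print(f"CLL:     about {n_cll:,.0f} tests")
```

Under that assumption, each threshold corresponds to a little under six million tested CpG sites.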
14.
J R Soc Interface ; 21(212): 20230607, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38442862

ABSTRACT

When employing mechanistic models to study biological phenomena, practical parameter identifiability is important for making accurate predictions across wide ranges of unseen scenarios, as well as for understanding the underlying mechanisms. In this work, we use a profile-likelihood approach to investigate parameter identifiability for four extensions of the Fisher-Kolmogorov-Petrovsky-Piskunov (Fisher-KPP) model, given experimental data from a cell invasion assay. We show that more complicated models tend to be less identifiable, with parameter estimates being more sensitive to subtle differences in experimental procedures, and that they require more data to be practically identifiable. As a result, we suggest that parameter identifiability should be considered alongside goodness-of-fit and model complexity as criteria for model selection.


Subjects
Mustelidae , Animals , Likelihood Functions , Research Design
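
The profile-likelihood approach mentioned in entry 14 can be sketched on a toy problem. The example below profiles out the variance of a normal sample to obtain a likelihood-based interval for the mean; it is an assumed illustration only, unrelated to the Fisher-KPP models of the paper, and the data values and function names are hypothetical. The cutoff 3.841 is the 95% quantile of the chi-square distribution with one degree of freedom.

```python
import math


def profile_loglik_mu(x, mu):
    """Normal log-likelihood at mu with the variance maximized out."""
    n = len(x)
    sigma2_hat = sum((xi - mu) ** 2 for xi in x) / n
    return -0.5 * n * (math.log(2 * math.pi * sigma2_hat) + 1.0)


def profile_interval(x, grid_size=2001, crit=3.841):
    """Approximate 95% profile-likelihood interval for the mean (grid search)."""
    n = len(x)
    mu_hat = sum(x) / n
    l_max = profile_loglik_mu(x, mu_hat)
    sd = math.sqrt(sum((xi - mu_hat) ** 2 for xi in x) / n)
    half_width = 5 * sd / math.sqrt(n)  # search window around the MLE
    lo, hi = mu_hat - half_width, mu_hat + half_width
    kept = []
    for i in range(grid_size):
        mu = lo + (hi - lo) * i / (grid_size - 1)
        # Keep mu if the likelihood ratio statistic is below the cutoff.
        if 2 * (l_max - profile_loglik_mu(x, mu)) <= crit:
            kept.append(mu)
    return kept[0], kept[-1]


sample = [4.1, 5.3, 4.8, 5.9, 5.0, 4.6]  # hypothetical measurements
lower, upper = profile_interval(sample)
print(f"profile interval for the mean: ({lower:.3f}, {upper:.3f})")
```

A parameter is practically identifiable in this sense when its profile drops below the cutoff on both sides of the maximum, giving a bounded interval; a flat profile would leave the interval open.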
15.
Genome Biol Evol ; 16(4)2024 Apr 02.
Article in English | MEDLINE | ID: mdl-38518756

ABSTRACT

Ancestral reconstruction is a widely used technique that has been applied to understand the evolutionary history of gain and loss of gene families. Ancestral gene content can be reconstructed via different phylogenetic methods, but many current and previous studies employ Dollo parsimony. We hypothesize that Dollo parsimony is not appropriate for ancestral gene content reconstruction inferences based on sequence homology, as Dollo parsimony is derived from the assumption that a complex character cannot be regained. This premise does not accurately model molecular sequence evolution, in which false orthology can result from sequence convergence or lateral gene transfer. The aim of this study is to test Dollo parsimony's suitability for ancestral gene content reconstruction and to compare its inferences with a maximum likelihood-based approach that allows a gene family to be gained more than once within a tree. We first compared the performance of the two approaches on a series of artificial data sets each of 5,000 genes that were simulated according to a spectrum of evolutionary rates without gene gain or loss, so that inferred deviations from the true gene count would arise only from errors in orthology inference and ancestral reconstruction. Next, we reconstructed protein domain evolution on a phylogeny representing known eukaryotic diversity. We observed that Dollo parsimony produced numerous ancestral gene content overestimations, especially at nodes closer to the root of the tree. These observations led us to the conclusion that, confirming our hypothesis, Dollo parsimony is not an appropriate method for ancestral reconstruction studies based on sequence homology.


Subjects
Molecular Evolution , Phylogeny , Likelihood Functions
16.
Environ Sci Pollut Res Int ; 31(14): 21073-21088, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38381289

ABSTRACT

This paper aims to create a new probability distribution and to conduct a statistical analysis of an air quality dataset from Kathmandu. Using this innovative distribution, we have studied the ground reality of air quality conditions in Kathmandu, Nepal. In our research, we have developed a new probability distribution known as the New Extended Kumaraswamy Exponential Distribution by introducing an additional shape parameter to the Extended Kumaraswamy Exponential (EKwE) Distribution. Statistical characteristics such as cumulative distribution function, probability density function, hazard function, reversed hazard function, skewness, kurtosis, survival function, and hazard rate function are studied. The suggested model is non-normal and positively skewed with increasing and inverted bathtub-shaped hazard rate curves. To assess the model's suitability, we utilized a real dataset comprising air quality data from Kathmandu, Nepal, during the year 2021. The study shows that the air quality data exhibit an increasing failure rate, but the P2.5, P10, and total suspended particle concentrations exhibited their lowest levels during the monsoon season and their highest levels during the winter season. Parameters of the model are estimated by using the least square estimation (LSE), maximum likelihood estimation (MLE), and Cramér-von Mises (CVM) approach for P10 at Ratnapark Station, Kathmandu. To assess the model's validity, P-P plots and Q-Q plots are employed. Model comparisons are carried out using Akaike Information Criterion (AIC), Corrected Akaike Information Criterion (CAIC), Bayesian Information Criterion (BIC), and Hannan-Quinn Information Criterion (HQIC). Furthermore, the goodness of fit of the proposed model is evaluated using test statistics such as the Anderson-Darling (A²) test, the Cramér-von Mises (CVM) test, and the Kolmogorov-Smirnov (KS) test along with their respective p-values.
The findings indicate that the air quality of Kathmandu, Nepal, was poor. The proposed distribution provides a better fit with greater flexibility for forecasting air quality data and conducting reliability data analyses. The dataset is analyzed and visualized using R programming.


Subjects
Air Pollution , Bayes Theorem , Nepal , Reproducibility of Results , Air Pollution/analysis , Likelihood Functions
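
The model-comparison criteria listed in entry 16 all penalize the maximized log-likelihood by model size. The sketch below computes them from their standard definitions, taking the abstract's "Corrected Akaike Information Criterion" to be the usual small-sample correction; the function name and the numeric inputs are hypothetical and are not values from the paper.

```python
import math


def information_criteria(loglik, k, n):
    """Standard criteria from a maximized log-likelihood.

    loglik: maximized log-likelihood, k: number of parameters,
    n: number of observations. Smaller values indicate a preferred model.
    """
    aic = 2 * k - 2 * loglik
    bic = k * math.log(n) - 2 * loglik
    caic = aic + 2 * k * (k + 1) / (n - k - 1)  # small-sample corrected AIC
    hqic = 2 * k * math.log(math.log(n)) - 2 * loglik
    return {"AIC": aic, "CAIC": caic, "BIC": bic, "HQIC": hqic}


# Hypothetical fit: log-likelihood -100 with 3 parameters on 50 observations.
ic = information_criteria(loglik=-100.0, k=3, n=50)
for name, value in ic.items():
    print(f"{name}: {value:.3f}")
```

Because every criterion shares the term -2 log L, candidate distributions fitted to the same data differ only through their penalty terms and their achieved likelihoods.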
17.
BMC Vet Res ; 20(1): 54, 2024 Feb 12.
Article in English | MEDLINE | ID: mdl-38347572

ABSTRACT

Free-living amoebae (FLA) are capable of inhabiting diverse reservoirs independently, without relying on a host organism, hence their designation as "free-living". The majority of amoebae that infect freshwater or marine fish are amphizoic, or free-living forms that may colonize fish under particular circumstances. Symphysodon aequifasciatus, commonly referred to as the discus, is widely recognized as a popular ornamental fish species. The primary objective of the present study was to determine the presence of pathogenic free-living amoebae (FLA) in samples of discus fish. Fish exhibiting clinical signs, sourced from various fish farms, were transferred to the ornamental fish clinic. The skin, gills, and intestinal mucosa of the fish were collected and subjected to culturing on plates containing a 1% non-nutrient agar medium. The detection of FLA was conducted through morphological, histopathological and molecular methods. The construction of the phylogenetic tree for Acanthamoeba genotypes was achieved using the maximum likelihood approach. The molecular sequence analysis revealed that all cultures that tested positive for FLA were T4 genotype of Acanthamoeba and Acanthamoeba sp. The examination of gill samples using histopathological methods demonstrated the presence of lamellar epithelial hyperplasia, significant fusion of secondary lamellae, and infiltration of inflammatory cells. A multitude of cysts, varying in shape from circular to elliptical, were observed within the gills. The occurrence of interlamellar vesicles and amoeboid organisms could be observed within the epithelial tissue of the gills. In the current study, presence of the Acanthamoeba T4 genotype on the skin and gills of discus fish exhibiting signs of illness in freshwater ornamental fish farms was identified. 
This observation suggests the potential for transmission of amoebic infection from ornamental fish to humans, thereby highlighting the need for further investigation into this infection among ornamental fish maintained as pets, as well as individuals who interact with them and their environment.


Subjects
Acanthamoeba , Amoeba , Cichlids , Humans , Animals , Amoeba/genetics , Phylogeny , Iran/epidemiology , Likelihood Functions , Acanthamoeba/genetics
18.
Genet Sel Evol ; 56(1): 11, 2024 Feb 06.
Article in English | MEDLINE | ID: mdl-38321371

ABSTRACT

BACKGROUND: The study of ancestral alleles provides insights into the evolutionary history, selection, and genetic structure of a population. In cattle, ancestral alleles are widely used in genetic analyses, including the detection of signatures of selection, determination of breed ancestry, and identification of admixture. A comprehensive list of ancestral alleles is expected to improve the accuracy of these analyses. However, the list of ancestral alleles in cattle, especially at the whole-genome sequence level, is far from complete: the largest current list (~42 million) represents less than 28% of the total number of detected variants in cattle. To address this issue and develop a genomic resource for evolutionary studies, we determined ancestral alleles in cattle by comparing previously derived whole-genome sequence variants to an out-species group using a population-based likelihood ratio test. RESULTS: Our study determined, and makes available, the largest list of ancestral alleles in cattle to date (70.1 million), including 2.3 million on the X chromosome. Concordance with the ancestral alleles from previous studies was high (97.6%) when only high-probability ancestral alleles were considered (29.8 million positions), and another 23.5 million high-confidence ancestral alleles were novel, expanding the available reference list to improve the accuracy of genetic analyses involving ancestral alleles. This high concordance implies that our approach, which uses genomic sequence variants and a likelihood ratio test to determine ancestral alleles, is appropriate. CONCLUSIONS: Given the high concordance of ancestral alleles across studies, the ancestral alleles determined in this study, including those not previously listed and particularly those with high-probability estimates, may be used for further genetic analyses with reasonable accuracy. Our approach, which uses predetermined variants within a species and a likelihood ratio test to determine ancestral alleles, is applicable to other species for which sequence-level genotypes are available.
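
The abstract above does not spell out the form of its likelihood ratio test, so the following is only a generic sketch of how such a test can call an ancestral allele at one site; it is not the authors' method. It assumes a simple binomial model on hypothetical out-group allele counts, a null hypothesis that both alleles are equally likely to be ancestral, and a chi-square cutoff with one degree of freedom. The function name `call_ancestral` and its parameters are illustrative.

```python
import math

# Chi-square critical value for 1 degree of freedom at alpha = 0.05.
CHI2_CRIT_DF1_ALPHA05 = 3.841


def xlogy(x: float, y: float) -> float:
    """Return x * log(y), using the convention 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)


def binom_loglik(k: int, n: int, p: float) -> float:
    """Binomial log-likelihood of k out of n, without the constant binomial coefficient."""
    return xlogy(k, p) + xlogy(n - k, 1.0 - p)


def call_ancestral(outgroup_ref: int, outgroup_alt: int):
    """Call the ancestral allele from out-group counts (hypothetical model).

    Returns (call, statistic), where call is "ref", "alt", or None when the
    test cannot reject the null that both alleles are equally supported.
    """
    n = outgroup_ref + outgroup_alt
    if n == 0:
        return None, 0.0
    p_hat = outgroup_ref / n  # maximum-likelihood estimate under the alternative
    stat = 2.0 * (binom_loglik(outgroup_ref, n, p_hat)
                  - binom_loglik(outgroup_ref, n, 0.5))
    if stat < CHI2_CRIT_DF1_ALPHA05:
        return None, stat
    return ("ref" if p_hat > 0.5 else "alt"), stat


# Example: ten out-group observations, all carrying the reference allele.
print(call_ancestral(10, 0))
```

A site with balanced out-group support, such as `call_ancestral(5, 5)`, returns no call, which is how a sketch of this kind would separate high-probability calls from ambiguous ones.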


Subjects
Genome-Wide Association Study , Genomics , Cattle , Animals , Alleles , Likelihood Functions , Genotype , Genomics/methods , Polymorphism, Single Nucleotide
19.
Brief Bioinform ; 25(2), 2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38324624

ABSTRACT

Connections between circular RNAs (circRNAs) and microRNAs (miRNAs) play a pivotal role in the onset, progression, diagnosis and treatment of diseases and tumors. Selecting the most promising circRNA-related miRNAs and using them as biological markers or drug targets could help in dealing with complex human diseases through preventive strategies, diagnostic procedures and therapeutic approaches. Compared to traditional biological experiments, using computational models to integrate diverse biological data and infer potential associations is more efficient and cost-effective. This paper developed a Convolutional Autoencoder model for CircRNA-MiRNA Association prediction (CA-CMA). First, the model merged the natural language characteristics of the circRNA and miRNA sequences with the features of circRNA-miRNA interactions. Next, it used all circRNA-miRNA pairs to construct a molecular association network, which was then fine-tuned on labeled samples to optimize the network parameters. Finally, the prediction outcome was obtained with a deep neural network classifier. The model combines a likelihood objective that preserves the neighborhood through optimization, in order to learn continuous feature representations of words and preserve the spatial information of two-dimensional signals. In 5-fold cross-validation, CA-CMA outperformed numerous prior computational approaches, with a mean area under the receiver operating characteristic curve of 0.9138 and an SD of 0.0024. Furthermore, in case studies, recent literature confirmed 25 of the 30 circRNA-miRNA pairs with the highest CA-CMA scores. These results highlight the robustness and versatility of our model.
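
The evaluation reported above (mean AUC and SD over 5-fold cross-validation) can be sketched as follows. This is not the CA-CMA model: a nearest-centroid scorer on one-dimensional synthetic data stands in for the classifier, the folds are stratified, and the SD is taken as the sample standard deviation across folds. All of these are assumptions, and every name in the block is hypothetical.

```python
import random
import statistics


def roc_auc(labels, scores):
    """AUC as the chance that a random positive outscores a random negative (ties count 1/2)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required to compute AUC")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_folds(labels, k, rng):
    """Split sample indices into k folds, spreading each class evenly across folds."""
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    rng.shuffle(pos)
    rng.shuffle(neg)
    return [pos[f::k] + neg[f::k] for f in range(k)]


def cross_validated_auc(xs, labels, k=5, seed=0):
    """Return (mean AUC, sample SD of AUC) over k held-out folds."""
    rng = random.Random(seed)
    aucs = []
    for test_idx in stratified_folds(labels, k, rng):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(xs)) if i not in held_out]
        # "Training": estimate one centroid per class from the training folds only.
        mu_pos = statistics.fmean([xs[i] for i in train_idx if labels[i] == 1])
        mu_neg = statistics.fmean([xs[i] for i in train_idx if labels[i] == 0])
        # Score held-out samples: larger means closer to the positive centroid.
        scores = [(xs[i] - mu_neg) ** 2 - (xs[i] - mu_pos) ** 2 for i in test_idx]
        aucs.append(roc_auc([labels[i] for i in test_idx], scores))
    return statistics.fmean(aucs), statistics.stdev(aucs)


# Synthetic, purely illustrative data: two overlapping Gaussian classes.
data_rng = random.Random(1)
labels = [i % 2 for i in range(200)]
xs = [data_rng.gauss(1.0 if y else 0.0, 1.0) for y in labels]
mean_auc, sd_auc = cross_validated_auc(xs, labels)
print(f"mean AUC = {mean_auc:.4f}, SD = {sd_auc:.4f}")
```

A small SD across folds, as the abstract reports, indicates that performance does not depend strongly on which samples are held out.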


Subjects
MicroRNAs , Neoplasms , Humans , MicroRNAs/genetics , RNA, Circular/genetics , Likelihood Functions , Neural Networks, Computer , Neoplasms/genetics , Computational Biology/methods
20.
Theor Popul Biol ; 156: 93-102, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38367870

ABSTRACT

Given a labeled tree topology t, consider a population P of k leaves chosen among those of t. The clade of P is the minimal subtree of t containing P, and its size is the number of leaves in the clade. When t is selected under the Yule or uniform distribution among the labeled topologies of size n, we study the "clade size" random variable, deriving closed formulas for its probability mass function, its mean, and its variance. Our calculations show that for large n the clade size tends to be smaller under the uniform model than under the Yule model, with a larger variability in the first scenario for values of k≥5. We apply our probability formulas to investigate set-theoretic relationships between the clades of two populations in a random tree, determining how likely it is that one clade is contained in, or equal to, the other. Our study relates to earlier calculations for the probability that under the Yule model the clade size of P equals the size of P - that is, the population P forms a monophyletic group - and extends known results for the probability that the minimal (non-trivial) clade containing a random taxon has a given size.
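
The clade-size variable defined above can be explored by simulation. The sketch below covers only the Yule model, grown here by repeatedly splitting a uniformly chosen leaf; it does not reproduce the paper's closed formulas or the uniform model. Function names and the parent-array representation are illustrative assumptions.

```python
import random


def yule_tree(n, rng):
    """Grow a rooted binary tree with n leaves by splitting a uniformly chosen leaf.

    Returns (parent, leaves): parent[v] is the parent of node v (None for the root),
    and leaves lists the node ids of the current leaves.
    """
    parent = [None]
    leaves = [0]
    while len(leaves) < n:
        i = rng.randrange(len(leaves))
        v = leaves[i]
        left, right = len(parent), len(parent) + 1
        parent.extend([v, v])   # v becomes internal with two new leaf children
        leaves[i] = left
        leaves.append(right)
    return parent, leaves


def path_to_root(parent, v):
    """Nodes from v up to the root, inclusive."""
    path = []
    while v is not None:
        path.append(v)
        v = parent[v]
    return path


def clade_size(parent, leaves, population):
    """Number of leaves in the minimal subtree containing every leaf in population."""
    below = [0] * len(parent)  # leaves descending from each node
    for leaf in leaves:
        for v in path_to_root(parent, leaf):
            below[v] += 1
    first_path = path_to_root(parent, population[0])
    common = set(first_path)
    for leaf in population[1:]:
        common &= set(path_to_root(parent, leaf))
    # The deepest shared ancestor is the first common node met on the way up.
    mrca = next(v for v in first_path if v in common)
    return below[mrca]


def mean_clade_size(n, k, trials, seed=0):
    """Monte Carlo estimate of the mean clade size of k random leaves in a Yule tree."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        parent, leaves = yule_tree(n, rng)
        total += clade_size(parent, leaves, rng.sample(leaves, k))
    return total / trials


print(mean_clade_size(n=50, k=5, trials=2000))
```

In this representation the population is monophyletic exactly when `clade_size` returns k, so the same loop can estimate the monophyly probability mentioned in the abstract by counting those trials.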


Subjects
Genetic Drift , Models, Genetic , Phylogeny , Likelihood Functions
...